DeepPurple: Lexical, String and Affective Feature Fusion for Sentence-Level Semantic Similarity Estimation

Authors

  • Nikos Malandrakis
  • Elias Iosif
  • Vassiliki Prokopi
  • Alexandros Potamianos
  • Shrikanth S. Narayanan
Abstract

This paper describes our submission for the *SEM shared task on Semantic Textual Similarity. We estimate the semantic similarity between two sentences using regression models with the following features: 1) n-gram hit rates (lexical matches) between sentences, 2) lexical semantic similarity between non-matching words, 3) string similarity metrics, 4) affective content similarity and 5) sentence length. Domain adaptation is applied in the form of independent models and a model selection strategy, achieving a mean correlation of 0.47.
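
As a rough illustration of how some of the features listed above could be computed for one sentence pair, the following is a minimal sketch and not the authors' implementation. The abstract does not define the hit rate precisely, so the definition here (fraction of one sentence's n-grams found in the other, averaged over both directions) is an assumption, as are the function names `ngrams`, `hit_rate` and `features`. The standard-library `difflib.SequenceMatcher` ratio stands in for the unspecified string similarity metrics. Lexical semantic similarity and affective similarity are left out because they depend on external resources the abstract does not detail.

```python
from difflib import SequenceMatcher


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hit_rate(tokens_a, tokens_b, n):
    """Fraction of n-grams in A that also occur in B (0.0 if A has none).

    Assumed definition; the source only names the feature.
    """
    grams_a = ngrams(tokens_a, n)
    if not grams_a:
        return 0.0
    grams_b = set(ngrams(tokens_b, n))
    return sum(g in grams_b for g in grams_a) / len(grams_a)


def features(sent_a, sent_b, max_n=2):
    """Build a hypothetical feature dictionary for one sentence pair."""
    a, b = sent_a.lower().split(), sent_b.lower().split()
    feats = {}
    for n in range(1, max_n + 1):
        # Average both directions so the feature is symmetric.
        feats[f"hit_rate_{n}"] = (hit_rate(a, b, n) + hit_rate(b, a, n)) / 2
    # Character-level string similarity (difflib used as a stand-in metric).
    feats["string_sim"] = SequenceMatcher(None, sent_a.lower(), sent_b.lower()).ratio()
    # Sentence length feature: mean token count of the pair.
    feats["length"] = (len(a) + len(b)) / 2
    return feats
```

Calling `features("a cat sat", "a cat slept")` returns a dictionary with one entry per feature; such vectors, paired with human similarity ratings, could then be passed to any regression model.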

Similar resources

DeepPurple: Estimating Sentence Semantic Similarity using N-gram Regression Models and Web Snippets

We estimate the semantic similarity between two sentences using regression models with features: 1) n-gram hit rates (lexical matches) between sentences, 2) lexical semantic similarity between non-matching words, and 3) sentence length. Lexical semantic similarity is computed via co-occurrence counts on a corpus harvested from the web using a modified mutual information metric. State-of-the-art...

EmotiWord: Affective Lexicon Creation with Application to Interaction and Multimedia Data

We present a fully automated algorithm for expanding an affective lexicon with new entries. Continuous valence ratings are estimated for unseen words using the underlying assumption that semantic similarity implies affective similarity. Starting from a set of manually annotated words, a linear affective model is trained using the least mean squares algorithm followed by feature selection. The p...

Kernel Models for Affective Lexicon Creation

Emotion recognition algorithms for spoken dialogue applications typically employ lexical models that are trained on labeled in-domain data. In this paper, we propose a domain-independent approach to affective text modeling that is based on the creation of an affective lexicon. Starting from a small set of manually annotated seed words, continuous valence ratings for new words are estimated using...

Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...

BIOSSES: a semantic sentence similarity estimation system for the biomedical domain

Motivation: The amount of information available in textual format is rapidly increasing in the biomedical domain. Therefore, natural language processing (NLP) applications are becoming increasingly important to facilitate the retrieval and analysis of these data. Computing the semantic similarity between sentences is an important component in many NLP tasks including text retrieval and summariza...

Publication date: 2013